Technology Capacity Management Techniques
Forecasting And Modelling
The prime objective of capacity management is to predict the behaviour of IT services under a given volume and variety
of work. Technology capacity forecasting is the process of modelling and forecasting the technology capacity required
by an engagement to meet the demands of the IT services. Forecasting reduces the risk of technology capacity-related
performance and availability issues. It addresses the uncertainty of the future by relying mainly on data from the
past and present, together with analysis of trends.
To provide a solid forecast, future business development and its impact on the level of resource utilization of the
relevant service components need to be identified. By comparing these requirements against the current levels of
resource utilization (which can be obtained from an accurate CMDB), it should be possible not only to generate a
forecast but also to propose options for meeting the business requirements. Agreeing which propositions should be
implemented, and setting priorities, should be done with the customer, as these decisions have financial and
environmental consequences.
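As an illustration of this comparison step, the minimal sketch below projects current utilization forward under assumed growth and flags components that would breach a threshold; the component names, growth factors and the 80% limit are hypothetical, not taken from any standard.

```python
# Hypothetical sketch: compare forecast demand against current utilization.
# Component names, growth factors and the 80% threshold are illustrative only.

current_utilization = {    # fraction of capacity, e.g. from an accurate CMDB
    "app-server": 0.55,
    "db-server": 0.70,
    "storage": 0.40,
}
forecast_growth = {        # multiplier from the expected business development
    "app-server": 1.30,    # +30% workload expected
    "db-server": 1.15,
    "storage": 1.60,
}
THRESHOLD = 0.80           # agreed maximum sustainable utilization

for component, util in current_utilization.items():
    projected = util * forecast_growth[component]
    if projected > THRESHOLD:
        print(f"{component}: projected {projected:.0%} exceeds {THRESHOLD:.0%}"
              " - propose additional capacity to the customer")
    else:
        print(f"{component}: projected {projected:.0%} is within capacity")
```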
Modelling answers two questions: what workload can be supported with given resources, and what level of service can be
provided for a given workload. Workload is the amount of resource usage over a certain period; it usually indicates the
throughput of work for a certain group of users or functions in an engagement. The first stage in modelling is to
create a baseline model that accurately reflects the performance currently being achieved. Once this baseline model has
been created, predictive modelling can begin, i.e. asking the 'What if?' questions that reflect failures and planned
changes to the hardware and/or the volume and variety of workloads. If the baseline model is accurate, then the
predicted effects of those potential failures and changes can be trusted.
Capacity modelling techniques range from estimates based on current resource utilization and experience, through
prototypes, to full-scale benchmarks and pilot studies. Each technique has strengths and weaknesses and suits different
scenarios. All types of modelling can achieve similar levels of accuracy, but all depend entirely on the skill of the
person constructing the model and on the information used to create it. The three most popular capacity modelling
techniques are trending, simulation modelling and analytical modelling.
Trending
Trending, also known as trend analysis, is a modelling technique in which historical data about resource utilization
and service performance is used to forecast future behaviour. The historical data is usually analyzed in a spreadsheet,
whose graphical, trending and forecasting facilities show resource utilization over time and how it is likely to change
in the future.
For systems that behave linearly, trending provides a sufficient analysis. In practice, trending is a viable approach
in some cases and not in others. Two key points decide whether it is a good approach: first, there should be no unknown
discontinuities; second, the underlying attributes of the system should be linear.
Trending is most effective when there is a small number of variables with a linear relationship between them. It is a
relatively inexpensive modelling technique, but it only provides estimates of future resource utilization. Trend
analysis is less effective at producing an accurate estimate of response times; for those, either analytical or
simulation modelling should be used.
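As a minimal sketch of such a trend analysis outside a spreadsheet, the example below fits a least-squares straight line to twelve months of CPU utilization and extrapolates it; the monthly figures are invented for illustration.

```python
# Hypothetical sketch: straight-line trend fit over monthly CPU utilization.
# The twelve monthly figures below are invented for illustration.
import numpy as np

months = np.arange(1, 13)
cpu_util = np.array([42, 44, 47, 48, 51, 53,
                     55, 58, 60, 61, 64, 66])   # % utilization per month

# Fit a first-degree polynomial (a straight line): util = slope*month + intercept
slope, intercept = np.polyfit(months, cpu_util, 1)

# Extrapolate six months ahead - valid only if growth stays linear and no
# discontinuity (e.g. a new application going live) intervenes.
for future_month in range(13, 19):
    projected = slope * future_month + intercept
    print(f"month {future_month}: projected CPU utilization {projected:.1f}%")
```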
Simulation Modelling
Simulation modelling is a technique in which computer programs emulate the static structure and the various dynamic
aspects of a system. Simulation modelling is typically used before making decisions on load allocation.
Simulation involves the modelling of discrete events (for example, transaction arrival rates) against a given hardware
configuration. For this reason it is also known as discrete event simulation. This type of modelling can be very
accurate in sizing new applications or predicting the effects of changes on existing applications. The downside of
this technique is that building and executing the model often takes a long time, which makes it costly.
When simulating transaction arrival rates, two approaches can be followed to input the data: either staff enter a
series of transactions from prepared scripts, or software inputs the same scripted transactions directly with a random
arrival rate. Either approach takes time and effort to prepare and run. However, it can be cost-justified for large
engagements where cost and the associated performance implications are of major importance.
If the response times predicted by the model are sufficiently close to the response times observed in the actual
production environment, the model can be considered effective. Although a simulation model takes longer to build and
consumes more resources than an analytical model, its results are correspondingly more accurate.
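A minimal sketch of a discrete event simulation follows; it models random transaction arrivals at a single server and measures response times. The arrival rate and mean service time are invented for illustration, and a real simulation package would model many more components.

```python
# Hypothetical sketch: discrete event simulation of transactions arriving at a
# single server. The arrival rate and mean service time are invented.
import random

random.seed(42)
ARRIVAL_RATE = 8.0    # transactions per second, arriving at random
SERVICE_TIME = 0.10   # mean seconds of server time per transaction

clock = 0.0           # simulated wall-clock time
server_free_at = 0.0  # when the server finishes its current transaction
response_times = []

for _ in range(10_000):
    clock += random.expovariate(ARRIVAL_RATE)       # next transaction arrives
    start = max(clock, server_free_at)              # queue if server is busy
    service = random.expovariate(1 / SERVICE_TIME)  # random service demand
    server_free_at = start + service
    response_times.append(server_free_at - clock)   # queueing + service time

print(f"mean response time: {sum(response_times) / len(response_times):.3f}s")
```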
Analytical Modelling
Analytical models are constructed and used by capacity planners to predict computing resource requirements related to
workload behaviour and volume changes. These are mathematical models that have a closed form solution, i.e. the
solution to the equations used to describe changes in a system can be expressed as a mathematical analytic function.
Analytical models are representations of the behaviour of computer systems using mathematical techniques – for example,
multi-class network queuing theory. Analytical models are used to assess current performance and predict future
performance.
Typically, a model is built using software packages by specifying within the package the components and structure of
the configuration that need to be modelled, and the utilization of the components – for example, processor, memory and
disks – by the various workloads or applications. When the model is run, queuing theory is used to calculate the
response times in the computer system. If the response times predicted by the model are sufficiently close to the
response times recorded in real life, the model can be regarded as an accurate representation of the computer
system.
To remain mathematically tractable, analytical models usually include little detail; they therefore tend to be
efficient to run, but less accurate than other modelling techniques. An analytical model does not take much time to
create, but it must be updated frequently.
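As a minimal illustration of the kind of queuing calculation such packages perform, the sketch below uses the classic single-server (M/M/1) queue, where the mean response time is the service time divided by one minus the utilization; the workload figures are invented and chosen to match the simulation sketch above.

```python
# Hypothetical sketch: analytical response time prediction using the classic
# single-server (M/M/1) queuing formula. Workload figures are invented.
SERVICE_TIME = 0.10   # mean seconds of server time per transaction

def mm1_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean response time of a single-server queue with random arrivals."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("utilization >= 100%: the queue grows without bound")
    return service_time / (1.0 - utilization)

# 'What if?' questions: predict response times as the workload grows.
for rate in (4.0, 6.0, 8.0, 9.5):   # transactions per second
    r = mm1_response_time(rate, SERVICE_TIME)
    print(f"{rate:4.1f} tx/s -> utilization {rate * SERVICE_TIME:.0%}, "
          f"mean response {r:.3f}s")
```

At the shared 8 transactions per second workload, this formula predicts a mean response time of roughly 0.5 seconds, which should agree closely with the simulation sketch; comparing the two is one way the models can be used to validate each other.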
Sizing
System Sizing
Based on the information received from technology capacity management, sizing of the IT infrastructure and
organisation to support the agreed service can be undertaken. Sizing should be undertaken together with specialists to
understand the IT components, with engagement management to understand the KPI aspects, and with service delivery
management to understand the resource aspects.
Any system must be sized to respond adequately during peak demand, which varies over time. To estimate capacity
requirements effectively, engagements must identify the demand on all resources during unplanned peak periods, such as
multiple users accessing a single site at the same time or increased load in the morning. There are two operational
states based on the load of a production system:
Green Zone – the state in which the system is operating under normal load conditions. A system operating in this range
must be able to sustain response times within the acceptable latency and service level targets.
Red Zone – the state in which the load is greater than the normal peak load but the system can still provide service
for a limited period of time. In this state there is a high chance of failures due to bottlenecks. The ultimate goal
must be to design and deploy an environment that can consistently support Red Zone load without service failure and
within acceptable latency and throughput targets.
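A minimal sketch of such a zone classification follows; the normal-peak threshold is hypothetical and would in practice be derived from measured latency and the agreed service level targets.

```python
# Hypothetical sketch: classify a production system's operating state by load.
# The normal-peak threshold is illustrative; in practice it is derived from
# measured latency and the agreed service level targets.
NORMAL_PEAK = 0.70    # fraction of capacity regarded as normal peak load

def operating_zone(load: float) -> str:
    """Return the operational state for a load given as a fraction of capacity."""
    if load <= NORMAL_PEAK:
        return "Green Zone: normal load, response times within targets"
    return "Red Zone: above normal peak - sustainable only for a limited period"

for load in (0.45, 0.82):
    print(f"load {load:.0%}: {operating_zone(load)}")
```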
Application Sizing
Application sizing must be used to estimate the resource requirements to support a proposed change to an existing
service or the implementation of a new service, to ensure that it meets its required service levels. To achieve this,
application sizing must be an integral part of the service lifecycle.
Application sizing has a finite lifespan. It is initiated at the design stage for a new service, or when there is a
major change to an existing service, and is completed when the application is accepted into the live operational
environment. Sizing activities should include all areas of technology related to the applications, infrastructure,
environment and data. Sizing activities are performed using modelling and trending techniques.
During the initial requirements and design, the required service levels must be specified in the service level
requirements. This enables the service design and development to employ the pertinent technologies and products to
achieve a design that meets the desired levels of service. It is much easier and less expensive to achieve the required
service levels if they are considered at the very beginning of the service lifecycle rather than at some later stage.
Other considerations in application sizing are the resilience aspects required for the
design of new services. Technology Capacity Management must provide advice and guidance to the Technology Availability
Management process on the resources required to provide the required level of performance and resilience.
The sizing of the application should be refined as design and development progress. Modelling can be used during
application sizing. The resources to be utilized by the application are likely to be shared with other services, and
potential threats to existing SLA targets must be recognized and managed.
Tuning
The analysis of the monitored data must identify areas of the configuration that could be tuned to make better use of
the service, system and component resources, or to improve the performance of a particular service.
Tuning techniques that are of assistance include:
- Balancing workloads and traffic: Transactions may arrive at the host or server via a particular gateway, depending
on where the transaction was initiated; balancing the ratio of initiation points to gateways evens out the load (see
the sketch after this list).
- Balancing disk traffic: Storing data on disk efficiently and strategically. For example, striping data across many
spindles can reduce data contention.
- Definition of an accepted locking strategy: This specifies when locks are necessary and at which level, for example
database, page, file, record or row. Delaying a lock until an update is necessary can provide worthwhile benefits.
- Efficient use of memory: This includes utilizing memory as appropriate to the circumstances.
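As a minimal sketch of the workload-balancing technique in the first bullet, the example below routes each incoming transaction to the least-loaded gateway; the gateway names and the in-memory counters are hypothetical.

```python
# Hypothetical sketch: balance incoming transactions across gateways by always
# routing to the least-loaded one. Gateway names and counts are illustrative;
# decrementing the count when a transaction completes is omitted for brevity.
active_transactions = {"gateway-a": 0, "gateway-b": 0, "gateway-c": 0}

def route(transaction_id: str) -> str:
    """Send the transaction to the gateway with the fewest active transactions."""
    gateway = min(active_transactions, key=active_transactions.get)
    active_transactions[gateway] += 1
    return gateway

for i in range(6):
    print(f"tx-{i} -> {route(f'tx-{i}')}")
```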
Before implementing any of the recommendations arising from these tuning techniques, it may be appropriate to test the
validity of each recommendation.
Implementation of recommendations: The objective of this activity is to introduce into live operation the changes that
have been identified by the monitoring, analysis and tuning activities. Any changes arising from these activities must
be implemented through a strict, formal change management process. System tuning changes can have major implications
for customer service, and the impact and risk associated with these types of change are likely to be greater than
those of other types of change.